Fixing a Broken ELBO
Abstract
We present an information-theoretic framework for understanding trade-offs in the unsupervised learning of deep latent-variable models with variational inference. The framework characterizes latent-variable models along two dimensions: the ability to reconstruct inputs (distortion) and the cost of communicating the latent code (rate). We derive the optimal frontier of generative models in the two-dimensional rate-distortion plane and show that the standard evidence lower bound (ELBO) objective is insufficient to select between points along this frontier. By performing targeted optimization to learn generative models with different rates, however, we obtain many models that achieve similar generative performance while making vastly different trade-offs in how they use the latent variable. Through experiments on MNIST and Omniglot with a variety of architectures, we show how this framework sheds light on many recently proposed extensions to the variational autoencoder family.
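To make the rate-distortion decomposition concrete, here is a minimal NumPy sketch, assuming a VAE with a diagonal Gaussian posterior and a Bernoulli decoder; the function names are illustrative rather than taken from the paper. The objective distortion + β·rate reduces to the standard negative ELBO at β = 1, and sweeping β is one way to perform the kind of targeted optimization described above.

```python
import numpy as np

def gaussian_kl(mu, logvar):
    # Rate: KL(q(z|x) || p(z)) in nats, for a diagonal Gaussian
    # posterior N(mu, exp(logvar)) against a standard normal prior.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def bernoulli_nll(x, logits):
    # Distortion: -log p(x|z) for a Bernoulli decoder with the given logits.
    return np.sum(np.logaddexp(0.0, logits) - x * logits, axis=-1)

def beta_objective(x, mu, logvar, logits, beta=1.0):
    # distortion + beta * rate; beta = 1 is the standard negative ELBO,
    # and varying beta targets different points on the R-D frontier.
    rate = gaussian_kl(mu, logvar)
    distortion = bernoulli_nll(x, logits)
    return distortion + beta * rate
```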
Similar papers
Filtering Variational Objectives
When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results. Inspired by this, we consider the extension of the ELBO to a family of lower bounds defined by a particle filter’s estimator of the marginal likelihood, the filtering variational objectives (FIVOs). FIVOs take the same arguments as th...
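As a rough illustration of the idea, the sketch below computes the log of a bootstrap particle filter's unbiased estimate of the marginal likelihood; by Jensen's inequality its expectation lower-bounds the log marginal likelihood, which is what makes it usable as a variational objective. The propose and log_weight hooks are hypothetical stand-ins for a model-specific proposal and weighting function, not an API from the paper.

```python
import numpy as np
from scipy.special import logsumexp

def fivo_estimate(x_seq, propose, log_weight, num_particles=8, seed=0):
    # Log of a particle filter's unbiased estimate of p(x_{1:T}).
    rng = np.random.default_rng(seed)
    z, log_z_hat = None, 0.0
    for x_t in x_seq:
        z = propose(x_t, z, num_particles, rng)    # sample new particles (hypothetical hook)
        log_w = log_weight(x_t, z)                 # unnormalized log weights (hypothetical hook)
        log_z_hat += logsumexp(log_w) - np.log(num_particles)
        probs = np.exp(log_w - logsumexp(log_w))   # normalized weights
        idx = rng.choice(num_particles, size=num_particles, p=probs)
        z = z[idx]                                 # multinomial resampling
    return log_z_hat
```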
BRST gauge fixing and the algebra of global supersymmetry
A global supersymmetry (SUSY) in supersymmetric gauge theory is generally broken by gauge fixing. A prescription for extracting physical information from such a SUSY algebra broken by gauge fixing is analyzed in the path-integral framework. If δ_SUSY δ_BRST Ψ = δ_BRST δ_SUSY Ψ for a gauge-fixing “fermion” Ψ, the SUSY charge density is written as a sum of the piece which is naively expected without gauge fixin...
Sticking the landing: A simple reduced-variance gradient for ADVI
Compared to the REINFORCE gradient estimator, the reparameterization trick usually gives lower-variance estimators. We propose a simple variant of the standard reparameterized gradient estimator for the evidence lower bound that has even lower variance under certain circumstances. Specifically, we decompose the derivative with respect to the variational parameters into two parts: a path derivat...
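A minimal PyTorch sketch of that decomposition is shown below, assuming a Gaussian variational posterior: re-evaluating log q(z|x) with the variational parameters detached removes the (zero-mean) score term, so only the path derivative contributes to the gradient. The encode, log_p_x_given_z, and log_p_z hooks are hypothetical placeholders, not names from the paper.

```python
import torch

def stl_elbo(x, encode, log_p_x_given_z, log_p_z):
    # "Sticking the landing": evaluate log q(z|x) with detached parameters
    # so only the path derivative (through z) reaches the encoder.
    mu, log_std = encode(x)
    z = mu + log_std.exp() * torch.randn_like(mu)   # reparameterized sample
    q = torch.distributions.Normal(mu.detach(), log_std.detach().exp())
    log_q = q.log_prob(z).sum(-1)
    return (log_p_x_given_z(x, z) + log_p_z(z) - log_q).mean()
```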
Supplementary Material: Personalizing Gesture Recognition Using Hierarchical Bayesian Neural Networks
For all three datasets, on fifteen random 75-25 split of the data, we trained a Hierarchical Bayesian Neural Network for 100 epochs, with and without using local reparameterization. When not using local reparameterization, we approximated the ELBO using 20 Monte Carlo samples whereas when using local reparameterization, we only used 1 sample. We plot the mean logarithm of the ELBO versus the nu...
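For readers unfamiliar with the trick, here is a minimal NumPy sketch of local reparameterization for a single linear layer, under the standard assumption of a fully factorized Gaussian posterior over the weights; the argument names are illustrative.

```python
import numpy as np

def local_reparam_linear(x, w_mu, w_logvar, rng):
    # Local reparameterization: for a factorized Gaussian posterior over
    # weights, the pre-activations x @ W are also Gaussian, so we sample
    # them directly. One noise draw per activation replaces a full weight
    # sample, reducing gradient variance enough that a single Monte Carlo
    # sample of the ELBO often suffices.
    act_mu = x @ w_mu
    act_var = (x ** 2) @ np.exp(w_logvar)
    return act_mu + np.sqrt(act_var) * rng.standard_normal(act_mu.shape)
```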